35 research outputs found

    Segmentation uncertainty estimation as a sanity check for image biomarker studies

    SIMPLE SUMMARY: Radiomics is referred to as quantitative image biomarker analysis. Due to the uncertainty in image acquisition, processing, and segmentation (delineation) protocols, the radiomic biomarkers lack reproducibility. In this manuscript, we show how this protocol-induced uncertainty can drastically reduce prognostic model performance and propose some insights on how to use it for developing better prognostic models. ABSTRACT: Problem. Image biomarker analysis, also known as radiomics, is a tool for tissue characterization and treatment prognosis that relies on routinely acquired clinical images and delineations. Due to the uncertainty in image acquisition, processing, and segmentation (delineation) protocols, radiomics often lack reproducibility. Radiomics harmonization techniques have been proposed as a solution to reduce these sources of uncertainty and/or their influence on the prognostic model performance. A relevant question is how to estimate the protocol-induced uncertainty of a specific image biomarker, what the effect is on the model performance, and how to optimize the model given the uncertainty. Methods. Two non-small cell lung cancer (NSCLC) cohorts, composed of 421 and 240 patients, respectively, were used for training and testing. Per patient, a Monte Carlo algorithm was used to generate three hundred synthetic contours with a surface dice tolerance measure of less than 1.18 mm with respect to the original GTV. These contours were subsequently used to derive 104 radiomic features, which were ranked on their relative sensitivity to contour perturbation, expressed in the parameter η. The top four (low η) and the bottom four (high η) features were selected for two models based on the Cox proportional hazards model. 
To investigate the influence of segmentation uncertainty on the prognostic model, we trained and tested the setup in 5000 augmented realizations (using a Monte Carlo sampling method); the log-rank test was used to assess the stratification performance and stability of segmentation uncertainty. Results. Although both the low and high η setups showed significant testing set log-rank p-values (p = 0.01) in the original GTV delineations (without segmentation uncertainty introduced), in the model with a high uncertainty-to-effect ratio, only around 30% of the augmented realizations resulted in model performance with p < 0.05 in the test set. In contrast, the low η setup performed with a log-rank p < 0.05 in 90% of the augmented realizations. Moreover, the high η setup classification was uncertain in its predictions for 50% of the subjects in the testing set (for 80% agreement rate), whereas the low η setup was uncertain only in 10% of the cases. Discussion. Estimating image biomarker model performance based only on the original GTV segmentation, without considering segmentation uncertainty, may be deceiving. The model might result in a significant stratification performance, but can be unstable for delineation variations, which are inherent to manual segmentation. Simulating segmentation uncertainty using the method described allows for more stable image biomarker estimation, selection, and model development. The segmentation uncertainty estimation method described here is universal and can be extended to estimate other protocol uncertainties (such as image acquisition and pre-processing).
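
The 80% agreement criterion in the Results can be illustrated with a small sketch. It flags a subject as uncertain when fewer than 80% of the augmented realizations assign that subject to the same risk group. The function name `uncertain_fraction`, the input layout (one list of 0/1 risk labels per realization), and the toy data are assumptions made for illustration; this is not the authors' code.

```python
from collections import Counter


def uncertain_fraction(predictions, agreement=0.8):
    """Fraction of subjects whose majority risk label is shared by
    fewer than `agreement` of the realizations.

    predictions: list of realizations; each realization is a list of
    0/1 risk-group labels, one per subject (hypothetical layout).
    """
    n_realizations = len(predictions)
    n_subjects = len(predictions[0])
    uncertain = 0
    for j in range(n_subjects):
        # Count how often each label was assigned to subject j.
        votes = Counter(run[j] for run in predictions)
        top_share = votes.most_common(1)[0][1] / n_realizations
        if top_share < agreement:
            uncertain += 1
    return uncertain / n_subjects


# Toy data: 5 realizations, 3 subjects (made-up values).
toy = [
    [1, 0, 1],
    [1, 0, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
]
frac = uncertain_fraction(toy)  # only the third subject falls below 80%
```

In this toy case the first subject has full agreement, the second reaches exactly 80%, and the third reaches only 60%, so one of three subjects is counted as uncertain.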

    Precision-medicine-toolbox: An open-source python package for facilitation of quantitative medical imaging and radiomics analysis

    Medical image analysis plays a key role in precision medicine, as it allows clinicians to identify anatomical abnormalities and is routinely used in clinical assessment. Data curation and pre-processing of medical images are critical steps in quantitative medical image analysis that can have a significant impact on the resulting model performance. In this paper, we introduce the precision-medicine-toolbox, which allows researchers to perform data curation, image pre-processing, handcrafted radiomics extraction (via Pyradiomics), and feature exploration tasks with Python. With this open-source solution, we aim to address the data preparation and exploration problem, bridge the gap between the currently existing packages, and improve the reproducibility of quantitative medical imaging research.

    Predicting Adverse Radiation Effects in Brain Tumors After Stereotactic Radiotherapy With Deep Learning and Handcrafted Radiomics

    Introduction: There is a cumulative risk of 20-40% of developing brain metastases (BM) in solid cancers. Stereotactic radiotherapy (SRT) enables the application of high focal doses of radiation to a volume and is often used for BM treatment. However, SRT can cause adverse radiation effects (ARE), such as radiation necrosis, which sometimes cause irreversible damage to the brain. It is therefore of clinical interest to identify patients at a high risk of developing ARE. We hypothesized that models trained with radiomics features, deep learning (DL) features, and patient characteristics or their combination can predict ARE risk in patients with BM before SRT. Methods: Gadolinium-enhanced T1-weighted MRIs and characteristics from patients treated with SRT for BM were collected for a training and testing cohort (N = 1,404) and a validation cohort (N = 237) from a separate institute. From each lesion in the training set, radiomics features were extracted and used to train an extreme gradient boosting (XGBoost) model. A DL model was trained on the same cohort to make a separate prediction and to extract the last layer of features. Different models using XGBoost were built using only radiomics features, DL features, and patient characteristics or a combination of them. Evaluation was performed using the area under the curve (AUC) of the receiver operating characteristic curve on the external dataset. Predictions for individual lesions and per patient developing ARE were investigated. Results: The best-performing XGBoost model on a lesion level was trained on a combination of radiomics features and DL features (AUC of 0.71 and recall of 0.80). On a patient level, a combination of radiomics features, DL features, and patient characteristics obtained the best performance (AUC of 0.72 and recall of 0.84). The DL model achieved an AUC of 0.64 and recall of 0.85 per lesion and an AUC of 0.70 and recall of 0.60 per patient. 
Conclusion: Machine learning models built on radiomics features and DL features extracted from BM combined with patient characteristics show potential to predict ARE at the patient and lesion levels. These models could be used in clinical decision making, informing patients on their risk of ARE and allowing physicians to opt for different therapies.

    CT Reconstruction Kernels and the Effect of Pre- and Post-Processing on the Reproducibility of Handcrafted Radiomic Features.

    Peer reviewed. Handcrafted radiomics features (HRFs) are quantitative features extracted from medical images to decode biological information to improve clinical decision making. Despite the potential of the field, limitations have been identified. The most important limitation currently identified is the sensitivity of HRFs to variations in image acquisition and reconstruction parameters. In this study, we investigated the use of Reconstruction Kernel Normalization (RKN) and ComBat harmonization to improve the reproducibility of HRFs across scans acquired with different reconstruction kernels. A set of phantom scans (n = 28) acquired on five different scanner models was analyzed. HRFs were extracted from the original scans, and scans were harmonized using the RKN method. ComBat harmonization was applied on both sets of HRFs. The reproducibility of HRFs was assessed using the concordance correlation coefficient. The difference in the number of reproducible HRFs in each scenario was assessed using McNemar's test. The majority of HRFs were found to be sensitive to variations in the reconstruction kernels, and only six HRFs were found to be robust with respect to variations in reconstruction kernels. The use of RKN resulted in a significant increment in the number of reproducible HRFs in 19 out of the 67 investigated scenarios (28.4%), while the ComBat technique resulted in a significant increment in 36 (53.7%) scenarios. The combination of methods resulted in a significant increment in 53 (79.1%) scenarios compared to the HRFs extracted from original images. Since the benefit of applying the harmonization methods depended on the data being harmonized, reproducibility analysis is recommended before performing radiomics analysis. 
For future radiomics studies incorporating images acquired with similar image acquisition and reconstruction parameters, except for the reconstruction kernels, we recommend the systematic use of the pre- and post-processing approaches (respectively, RKN and ComBat).
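
For readers unfamiliar with the concordance correlation coefficient used above to assess reproducibility, the following is a minimal sketch of Lin's definition, 2·cov(x, y) / (var(x) + var(y) + (mean(x) − mean(y))²), using population (1/n) moments. The function name and the toy feature values are assumptions for illustration; the abstract does not state which implementation or cut-off the authors used.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient between two paired
    measurement series, e.g. one feature extracted from the same
    phantom under two reconstruction kernels (hypothetical use)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population (1/n) variance and covariance, as in Lin's definition.
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mean_x - mean_y) ** 2)


# Made-up feature values: a constant offset between kernels lowers the
# CCC even though the two series are perfectly correlated.
kernel_a = [1.0, 2.0, 3.0]
kernel_b = [2.0, 3.0, 4.0]
ccc_same = concordance_correlation(kernel_a, kernel_a)
ccc_shift = concordance_correlation(kernel_a, kernel_b)
```

Unlike the Pearson correlation, the coefficient penalizes systematic shifts between the two series, which is why it suits reproducibility checks across acquisition settings.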

    Automated detection and segmentation of non-small cell lung cancer computed tomography images.

    Peer reviewed. Detection and segmentation of abnormalities on medical images are highly important for patient management, including diagnosis, radiotherapy, and response evaluation, as well as for quantitative image research. We present a fully automated pipeline for the detection and volumetric segmentation of non-small cell lung cancer (NSCLC) developed and validated on 1328 thoracic CT scans from 8 institutions. Along with quantitative performance detailed by image slice thickness, tumor size, image interpretation difficulty, and tumor location, we report an in-silico prospective clinical trial, where we show that the proposed method is faster and more reproducible compared to the experts. Moreover, we demonstrate that on average, radiologists and radiation oncologists preferred automatic segmentations in 56% of the cases. Additionally, we evaluate the prognostic power of the automatic contours by applying RECIST criteria and measuring the tumor volumes. Segmentations by our method stratified patients into low and high survival groups with higher significance compared to those methods based on manual contours.

    Can predicting COVID-19 mortality in a European cohort using only demographic and comorbidity data surpass age-based prediction: An externally validated study.

    Peer reviewed. OBJECTIVE: To establish whether one can build a mortality prediction model for COVID-19 patients based solely on demographics and comorbidity data that outperforms age alone. Such a model could be a precursor to implementing smart lockdowns and vaccine distribution strategies. METHODS: The training cohort comprised 2337 COVID-19 inpatients from nine hospitals in The Netherlands. The clinical outcome was death within 21 days of being discharged. The features were derived from electronic health records collected during admission. Three feature selection methods were used: LASSO, univariate using a novel metric, and pairwise (age being half of each pair). 478 patients from Belgium were used to test the model. All modeling attempts were compared against an age-only model. RESULTS: In the training cohort, the mortality group's median age was 77 years (interquartile range = 70-83), higher than the non-mortality group (median = 65, IQR = 55-75). The incidence of former/active smokers, male gender, hypertension, diabetes, dementia, cancer, chronic obstructive pulmonary disease, chronic cardiac disease, chronic neurological disease, and chronic kidney disease was higher in the mortality group. All stated differences were statistically significant after Bonferroni correction. LASSO selected eight features, novel univariate chose five, and pairwise chose none. No model was able to surpass an age-only model in the external validation set, where age had an AUC of 0.85 and a balanced accuracy of 0.77. CONCLUSION: When applied to an external validation set, we found that an age-only mortality model outperformed all modeling attempts (curated on www.covid19risk.ai) using three feature selection methods on 22 demographic and comorbid features.
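
The two metrics reported for the age-only baseline can be sketched as follows. The AUC is computed as the fraction of (deceased, survivor) pairs in which the deceased patient has the higher score, with ties counted as one half; balanced accuracy is the mean of sensitivity and specificity at a chosen cut-off. The function names, the toy ages and outcomes, and the 70-year cut-off are assumptions for illustration only and are not taken from the study.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison
    (equivalent to the Mann-Whitney U statistic, ties count 0.5)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def balanced_accuracy(scores, labels, threshold):
    """Mean of sensitivity and specificity when predicting the
    positive class for scores at or above `threshold`."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up cohort: age in years used directly as the risk score.
ages = [82, 77, 60, 55, 66, 68]
died = [1, 1, 0, 0, 1, 0]
age_auc = auc(ages, died)
age_bacc = balanced_accuracy(ages, died, threshold=70)  # hypothetical cut-off
```

Using age itself as the score means no model fitting is needed for the baseline, which is what makes it a natural reference point for the feature-selected models.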

    Beyond automatic medical image segmentation - the spectrum between fully manual and fully automatic delineation

    Semi-automatic and fully automatic contouring tools have emerged as an alternative to fully manual segmentation to reduce time spent contouring and to increase contour quality and consistency. In particular, fully automatic segmentation has seen exceptional improvements through the use of deep learning in recent years. These fully automatic methods may not require user interactions, but the resulting contours are often not suitable to be used in clinical practice without a review by the clinician. Furthermore, they need large amounts of labeled data to be available for training. This review presents alternatives to manual or fully automatic segmentation methods along the spectrum of variable user interactivity and data availability. The challenge lies in determining how much user interaction is necessary and how this user interaction can be used most effectively. While deep learning is already widely used for fully automatic tools, interactive methods are only beginning to be transformed by it. Interaction between clinician and machine, via artificial intelligence, can go both ways, and this review presents the avenues that are being pursued to improve medical image segmentation.

    Precision-medicine-toolbox: An open-source python package for the quantitative medical image analysis

    Peer reviewed. Medical image analysis plays a key role in precision medicine. Data curation and pre-processing are critical steps in quantitative medical image analysis that can have a significant impact on the resulting performance of machine learning models. In this work, we introduce the Precision-medicine-toolbox, allowing clinical and junior researchers to perform data curation, image pre-processing, radiomics extraction, and feature exploration tasks with a customizable Python package. With this open-source tool, we aim to facilitate the crucial data preparation and exploration steps, bridge the gap between the currently existing packages, and improve the reproducibility of quantitative medical imaging research.